[quantization][DRAFT] Disk space consumption improvements for full model quantization by stamalakhov · Pull Request #495 · Samsung/TICO

stamalakhov · 2026-02-16T12:52:28Z

This PR stops the static `causal_mask`/`position_embeddings` from being populated in every layer, which saves disk space.

It precomputes the static `causal_mask`/`position_embeddings` once for use in llama/quant_decoder_layer, so that each quantized decoder layer is no longer populated with its own copy of these statically computed parameters.
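To illustrate why sharing the precomputed inputs reduces the exported size, here is a minimal, framework-free sketch. It is not the code from this PR: `build_causal_mask`, `build_rope_tables`, `Layer`, and `stored_bytes` are hypothetical names, the masked value and RoPE base are assumed defaults, and plain `array` buffers stand in for the real tensors. The only point is that buffers referenced by every layer are counted once, while per-layer copies are counted once per layer.

```python
import math
from array import array


def build_causal_mask(seq_len, masked_value=-1e4):
    """Row-major (seq_len x seq_len) additive mask: 0 where key <= query."""
    return array("f", (0.0 if k <= q else masked_value
                       for q in range(seq_len) for k in range(seq_len)))


def build_rope_tables(seq_len, head_dim, base=10000.0):
    """Row-major (seq_len x head_dim) cos/sin tables for rotary embeddings."""
    half = head_dim // 2
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(half)]
    cos, sin = array("f"), array("f")
    for pos in range(seq_len):
        # Both halves of the head dimension reuse the same angles.
        angles = [pos * f for f in inv_freq] * 2
        cos.extend(math.cos(a) for a in angles)
        sin.extend(math.sin(a) for a in angles)
    return cos, sin


class Layer:
    """Stand-in for a decoder layer that holds its static inputs."""

    def __init__(self, mask, cos, sin):
        self.mask, self.cos, self.sin = mask, cos, sin


def stored_bytes(layers):
    """Bytes needed if each distinct buffer object is serialized once."""
    unique = {}
    for layer in layers:
        for buf in (layer.mask, layer.cos, layer.sin):
            unique[id(buf)] = len(buf) * buf.itemsize
    return sum(unique.values())


SEQ_LEN, HEAD_DIM, NUM_LAYERS = 8, 4, 3  # toy sizes, chosen for illustration

# Before: every layer builds and owns its own copy of the static inputs.
duplicated = [
    Layer(build_causal_mask(SEQ_LEN), *build_rope_tables(SEQ_LEN, HEAD_DIM))
    for _ in range(NUM_LAYERS)
]

# After: the static inputs are computed once and passed to every layer.
shared_mask = build_causal_mask(SEQ_LEN)
shared_cos, shared_sin = build_rope_tables(SEQ_LEN, HEAD_DIM)
shared = [Layer(shared_mask, shared_cos, shared_sin) for _ in range(NUM_LAYERS)]

print(stored_bytes(duplicated), stored_bytes(shared))
```

In this toy setup the duplicated variant stores `NUM_LAYERS` times as many bytes as the shared one. How much this matters for a real model depends on the sequence length, head dimension, and layer count.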

With this PR, the circle model for HuggingFaceTB/SmolLM2-135M-Instruct is just 105 MiB (vs. 300 MiB with #492), roughly 65% smaller.

Draft: #436

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>


stamalakhov added 2 commits February 13, 2026 13:19

[quantization] Quantization of Llama

c0cdefb

This PR quantizes the full `LLama` model and converts it to circle format. TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>

[DRAFT] Improvements in disk space

a0b1d47

This PR fixes population of static `causal_masks`\`position_embeddings` through the layers to save disk space. TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>

stamalakhov self-assigned this Feb 16, 2026

stamalakhov changed the title ~~[quantization][DRAFT] Improvements in disk space for full model quantization~~ [quantization][DRAFT] Disk space consumption improvements for full model quantization Feb 16, 2026

stamalakhov wants to merge 2 commits into Samsung:main from stamalakhov:quant_full_model_impr_size
